AITopics | data matrix

We propose efficient algorithms for simultaneous clustering and completion of incomplete high-dimensional data that lie in a union of low-dimensional subspaces. We cast the problem as finding a completion of the data matrix so that each point can be reconstructed as a linear or affine combination of a few data points. Since the problem is NP-hard, we propose a lifting framework and reformulate the problem as a group-sparse recovery of each incomplete data point in a dictionary built using incomplete data, subject to rank-one constraints. To solve the problem efficiently, we propose a rank pursuit algorithm and a convex relaxation. The solution of our algorithms recover missing entries and provides a similarity matrix for clustering. Our algorithms can deal with both low-rank and high-rank matrices, does not suffer from initialization, does not need to know dimensions of subspaces and can work with a small number of data points. By extensive experiments on synthetic data and real problems of video motion segmentation and completion of motion capture data, we show that when the data matrix is low-rank, our algorithm performs on par with or better than low-rank matrix completion methods, while for high-rank data matrices, our method significantly outperforms existing algorithms.

artificial intelligence, machine learning, subspace, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.89)

Add feedback

Data Poisoning Attacks on Factorization-Based Collaborative Filtering

Bo Li, Yining Wang, Aarti Singh, Yevgeniy Vorobeychik

Neural Information Processing SystemsMar-23-2026, 14:49:27 GMT

Recommendation and collaborative filtering systems are important in modern information and e-commerce applications. As these systems are becoming increasingly popular in the industry, their outputs could affect business decision making, introducing incentives for an adversarial party to compromise the availability or integrity of such systems. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected. While the complete knowledge assumption seems extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks.

artificial intelligence, attacker, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.94)

Industry:

Information Technology > Security & Privacy (0.47)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)

Add feedback

Robust Spectral Detection of Global Structures in the Data by Learning a Regularization

Pan Zhang

Neural Information Processing SystemsMar-23-2026, 06:18:19 GMT

Spectral methods are popular in detecting global structures in the given data that can be represented as a matrix. However when the data matrix is sparse or noisy, classic spectral methods usually fail to work, due to localization of eigenvectors (or singular vectors) induced by the sparsity or noise. In this work, we propose a general method to solve the localization problem by learning a regularization matrix from the localized eigenvectors. Using matrix perturbation analysis, we demonstrate that the learned regularizations suppress down the eigenvalues associated with localized eigenvectors and enable us to recover the informative eigenvectors representing the global structure. We show applications of our method in several inference problems: community detection in networks, clustering from pairwise similarities, rank estimation and matrix completion problems. Using extensive experiments, we illustrate that our method solves the localization problem and works down to the theoretical detectability limits in different kinds of synthetic data. This is in contrast with existing spectral algorithms based on data matrix, non-backtracking matrix, Laplacians and those with rank-one regularizations, which perform poorly in the sparse case with noise.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Robust Spectral Detection of Global Structures in the Data by Learning a Regularization

Neural Information Processing SystemsMar-17-2026, 07:54:13 GMT

Spectral methods are popular in detecting global structures in the given data that can be represented as a matrix. However when the data matrix is sparse or noisy, classic spectral methods usually fail to work, due to localization of eigenvectors (or singular vectors) induced by the sparsity or noise. In this work, we propose a general method to solve the localization problem by learning a regularization matrix from the localized eigenvectors. Using matrix perturbation analysis, we demonstrate that the learned regularizations suppress down the eigenvalues associated with localized eigenvectors and enable us to recover the informative eigenvectors representing the global structure. We show applications of our method in several inference problems: community detection in networks, clustering from pairwise similarities, rank estimation and matrix completion problems. Using extensive experiments, we illustrate that our method solves the localization problem and works down to the theoretical detectability limits in different kinds of synthetic data. This is in contrast with existing spectral algorithms based on data matrix, non-backtracking matrix, Laplacians and those with rank-one regularizations, which perform poorly in the sparse case with noise.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.64)

Add feedback

Learning A Structured Optimal Bipartite Graph for Co-Clustering

Feiping Nie, Xiaoqian Wang, Cheng Deng, Heng Huang

Neural Information Processing SystemsFeb-18-2026, 21:25:51 GMT

Ingraph based co-clustering methods, abipartite graph isconstructed todepict the relation between features and samples.

artificial intelligence, machine learning, theproblem, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.05)
Asia > China (0.05)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.37)

Add feedback

51425b752a0b402ed3effc83fc4bbb74-Paper.pdf

Neural Information Processing SystemsFeb-12-2026, 04:17:40 GMT

gaussian, matrix, sketched program, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

fe2b421b8b5f0e7c355ace66a9fe0206-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-11-2026, 06:46:23 GMT

generalization error, ml-mat-v amp, reviewer, (11 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

3a5b75ce6cbd3aaaa32d6e935ffc4cff-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 12:00:52 GMT

anchor rate, handwritten4, msrc, (13 more...)

Neural Information Processing Systems

Country:

Asia > China > Shaanxi Province > Xi'an (0.06)
Asia > China > Heilongjiang Province > Harbin (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Sparse Convex Biclustering

Jiang, Jiakun, Xiang, Dewei, Gu, Chenliang, Liu, Wei, Wang, Binhuan

arXiv.org Machine LearningJan-6-2026

Biclustering is an essential unsupervised machine learning technique for simultaneously clustering rows and columns of a data matrix, with widespread applications in genomics, transcriptomics, and other high-dimensional omics data. Despite its importance, existing biclustering methods struggle to meet the demands of modern large-scale datasets. The challenges stem from the accumulation of noise in high-dimensional features, the limitations of non-convex optimization formulations, and the computational complexity of identifying meaningful biclusters. These issues often result in reduced accuracy and stability as the size of the dataset increases. To overcome these challenges, we propose Sparse Convex Biclustering (SpaCoBi), a novel method that penalizes noise during the biclustering process to improve both accuracy and robustness. By adopting a convex optimization framework and introducing a stability-based tuning criterion, SpaCoBi achieves an optimal balance between cluster fidelity and sparsity. Comprehensive numerical studies, including simulations and an application to mouse olfactory bulb data, demonstrate that SpaCoBi significantly outperforms state-of-the-art methods in accuracy. These results highlight SpaCoBi as a robust and efficient solution for biclustering in high-dimensional and large-scale datasets.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2601.01757

Country: Asia > China (0.46)

Genre: Research Report > Promising Solution (0.54)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)

Add feedback

High-Dimensional Optimization in Adaptive Random Subspaces

Neural Information Processing SystemsDec-25-2025, 09:12:43 GMT

We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly outperforms the oblivious sampling method, which is the common choice in the recent literature. The adaptive subspace can be efficiently generated by a correlated random matrix ensemble whose statistics mimic the input data. We prove that the improvement in the relative error of the solution can be tightly characterized in terms of the spectrum of the data matrix, and provide probabilistic upper-bounds. We then illustrate the consequences of our theory with data matrices of different spectral decay. Extensive experimental results show that the proposed approach offers significant speed ups in machine learning problems including logistic regression, kernel classification with random convolution layers and shallow neural networks with rectified linear units. Our analysis is based on convex analysis and Fenchel duality, and establishes connections to sketching and randomized matrix decompositions.

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback